Disk usage problem with BLOBs using libpq - Mailing list pgsql-interfaces

From John Khouri
Subject Disk usage problem with BLOBs using libpq
Date
Msg-id a05200f04bb56e3b071f5@[130.107.14.36]
Whole thread Raw
List pgsql-interfaces
Hi,

I'm experiencing a disk usage "lossage" when rewriting large objects (BLOBs) multiple times.  I'm using the libpq libraries, posgresql 7.3.2 on FreeBSD-4.5 (default settings on installation), but I don't believe the version or OS are factors here.

In the test case below, a 5k-byte BLOB is created once and rewritten n-times.  Given these multiple same-size rewrites, I would expect the disk usage to increase only slightly due to (hopefully small) overhead needed for the transaction.  However I am seeing a substantial increase in disk usage that is linear with respect to the number of rewrites (in /usr/local/pgsql/data/base/16976):

- 10000 rewrites increases the disk space by 11.5 Mbytes
- 50000 rewrites increases the disk space by 57.0 Mbytes
  and so on

Is this a postgres "bug" or is there a way to minimize this "lost" disk space by adjusting a configuration setting or manipulating the transaction?  (Note: in our actual application with multiple BLOBS summing 40 Mbytes, the resulting disk space consumed in ...pgsql/data/base... is over 4 Gbytes).

Any help is appreciated.

Thanks,
JK




#include <stdio.h>
#include "libpq-fe.h"
#include "libpq/libpq-fs.h"
                    
#define BLOB_SIZE 5000

static void writeBlob (PGconn* conn, Oid lobjId)
{
 
char buf[BLOB_SIZE];
 
int  lobj_fd;
 
int  count;
 
int  status;

  lobj_fd = lo_open (conn, lobjId, INV_WRITE);
                    
  count = lo_write (conn, lobj_fd, buf, BLOB_SIZE);
 
if (count < BLOB_SIZE)
    fprintf (stderr,
"error while writing large object\n");
                    
  status = lo_close (conn, lobj_fd);
 
if (status != 0)
    fprintf (stderr,
"lo_close error %d\n", status);
}


int main (int argc, char **argv)
{
  Oid         lobjId;
  PGconn*     conn;
  PGresult*   res;
 
int         num_rewrites;
 
int         j;
                    
  conn = PQconnectdb ("dbname='test_db' user='postgres'");
                    
 
if (PQstatus(conn) == CONNECTION_BAD) {
    fprintf(stderr, "Connection to database 'test_db' failed.\n");
    fprintf(stderr, "%s", PQerrorMessage(conn));
   
return (1);
  }
                    
  num_rewrites = atoi (argv[
1]);

  res = PQexec (conn,
"begin");
  PQclear(res);
            
 
/* create initial blob */

  lobjId = lo_creat (conn, INV_READ | INV_WRITE);

 
if (lobjId == 0)
    fprintf (stderr,
"error creating large object\n");
                    
  writeBlob (conn, lobjId);

  res = PQexec (conn,
"end");
  PQclear (res);
 
 
/* rewrite blob "num_rewrites" times */

 
for (j = 0; j < num_rewrites; j++) {
      res = PQexec (conn,
"begin");
      PQclear(res);
      writeBlob (conn, lobjId);              
      res = PQexec (conn,
"end");
      PQclear (res);
  }
 
  PQfinish (conn);
 
return (0);
}

pgsql-interfaces by date:

Previous
From: "D'Arcy J.M. Cain"
Date:
Subject: Re: Seeking PyGreSQL help
Next
From: Bruce Momjian
Date:
Subject: Re: ecpg problem in 7.4 beta